An Evolutionary Scheme for Web Replication and Caching
Author
Abstract
The design and implementation of effective caching schemes has been a critical issue with respect to World Wide Web object circulation and availability. Caching and replication have been combined and applied in prototype systems in order to reduce the overall bandwidth and increase a system's fault tolerance. This paper presents a model for optimizing access performance when requesting Web objects across distributed systems. The replication and caching scheme is designed by the use of an evolutionary computation algorithm. Cached data are considered as a population evolving over simulated time, replicating the most prominent data to dedicated replication servers. The simulation model is experimented with and tested on cache traces provided by the Squid proxy cache server at the Aristotle University of Thessaloniki. Cache hit rates and byte hit lengths are reported, showing that the proposed evolutionary mechanisms improve cache consistency and reliability.

Index terms: World Wide Web caching, Web replication and caching, cache consistency, evolutionary computation, genetic algorithms.

Introduction and Previous Work

The continuously rapid growth and worldwide expansion of the Internet has introduced new issues, such as World Wide Web (WWW) traffic, bandwidth insufficiency, and distributed object exchange. Web caching has presented an effective solution, since it provides mechanisms for faster web access, improved load balancing, and reduced server load. Cache efficiency depends on its content update frequency as well as on the algorithmic approach used to keep the cache content reliable and consistent. Most web servers are reinforced with proxy cache servers, which bring web objects closer to end users by adding specific cache consistency mechanisms and cache hierarchies. Several approaches have been suggested for more effective cache management, and the problem of maintaining an updated cache has recently gained a lot of attention, due to the fact that many web caches often fail to maintain a
consistent cache. Several techniques and frameworks have been proposed towards a more reliable and consistent cache infrastructure. Cache consistency mechanisms have been included in almost every proxy cache server, and their improvement has become a major research issue. A survey of contemporary cache consistency mechanisms on the Internet has been presented, where trace-driven simulation shows that a weak cache consistency protocol reduces network bandwidth and server load more than prior estimates of an object's life cycle or invalidation protocols. The potential of document caching at the application level has also been discussed, addressing the need for better resource management towards document latency reduction. The design and efficiency of a cache hierarchy is a major issue in most proxy caches and in various research efforts. The performance improvement of Internet information systems supporting hierarchical proxy caches has been argued, since a cache hierarchy is introduced in order to achieve better system scale. It has been shown that hierarchical caching of FTP files could eliminate half of all file transfers, whereas the efficiency of proxy cache operations has been examined under a distributed WWW cache using election algorithms and a hierarchy similar to xFS, while a low-level simulation of a proxy cache considers further details, such as connection aborts, in order to extend the high-level metrics used so far. Caching and replication have been discussed where the performance of a proxy cache server is evaluated and validated. Caching and replication have proved to be beneficial both in the circulation of web objects and in the Web server's functionality. The need for replication has been discussed, with an alternative approach suggesting the wide distribution of Internet load across multiple servers. Furthermore, prefetching and caching are techniques proposed to reduce latency in the Web; several bounds on the performance improvement seen from these techniques have been derived under
specific workloads. The replication and caching methodology has raised a lot of research and implementation interest, and working groups and research teams have been established for a coordinated replication and caching framework within the Internet community.

Evolutionary computation policies have been used to solve scientific problems demanding optimization and adaptation to a changing environment. The idea in these approaches is to evolve a population of candidate solutions to a given problem, using operations inspired by natural genetic variation and natural selection, expressed as survival of the fittest. Usually grouped under the term evolutionary algorithms or evolutionary computation, we find the domains of genetic algorithms, evolution strategies, and genetic programming. Genetic algorithms (GAs) comprise one of the main evolutionary methods applied to many computational problems requiring either search through a huge number of possible solutions or adaptation to a changing environment. The innovation of GAs is that they work with a coding of the parameter set, not the parameters themselves; they search from a population of points; and they use probabilistic transition rules. More specifically, GAs have been applied in the areas of scientific modeling and machine learning, but recently there has been growing interest in their application in other fields.

This paper presents a model based on an evolutionary computation approach in order to design and simulate an effective Web replication and caching scheme. The model is implemented by an algorithmic approach adapted to the genetic algorithm process. The implementation is based on the Squid proxy cache server specifications for representing Web objects as individuals to be cached and replicated. The simulated model is experimented with under real Squid cache traces and cache log files. The contributions of the paper are twofold. First, a caching scheme is maintained by the use of evolution over a number of successive populations of
cached objects. Second, replication is introduced to extend the caching scheme, and the objects chosen for replication are identified by their preservation over the successive steps of the evolutionary scheme.

The remainder of the paper is organized as follows. The next section describes Web proxy cache environments and various cache infrastructures, with emphasis on the Squid proxy cache. The following section presents the design and structure of the replication and caching model, which is based on evolutionary computation. The model's implementation details and operational functions are then discussed, and results from trace-driven experimentation are presented afterwards; the results refer to cache hit rates, byte hit lengths, and file-type hit rates. The final section points out some conclusions and discusses potential future work.
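The core idea described above, treating cached objects as a population whose fittest members survive each generation and promoting long-term survivors to replication servers, can be illustrated with a minimal Python sketch. Everything here is an illustrative assumption rather than the paper's actual algorithm: the `fitness` weighting, the `replication_threshold` parameter, and the use of simple truncation selection (the full GA machinery of encoding, crossover, and mutation is omitted).

```python
class CachedObject:
    """A cached Web object treated as an individual in the evolving population."""
    def __init__(self, url, size, hits):
        self.url = url        # object identifier
        self.size = size      # object size in bytes
        self.hits = hits      # request count observed in the trace
        self.survived = 0     # consecutive generations the object stayed cached

def fitness(obj):
    # Illustrative fitness: reward popularity, lightly penalise large objects.
    return obj.hits / (1.0 + obj.size / 1024.0)

def evolve(population, cache_slots, replication_threshold=3):
    """One generation of truncation selection: keep the fittest objects in
    the cache; objects that survive several consecutive generations are
    returned as candidates for the dedicated replication servers."""
    survivors = sorted(population, key=fitness, reverse=True)[:cache_slots]
    replicas = []
    for obj in population:
        if obj in survivors:
            obj.survived += 1
            if obj.survived >= replication_threshold:
                replicas.append(obj)
        else:
            obj.survived = 0  # evicted objects lose their survival streak
    return survivors, replicas
```

Feeding updated trace statistics into `hits` between generations would drive the evolution; objects that persist across successive generations are exactly those the scheme would push out to replication servers.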
Similar Resources
Improve Replica Placement in Content Distribution Networks with Hybrid Technique
The increased use of the Internet and its accelerated growth lead to reduced network bandwidth and server capacity; therefore, the quality of Internet services becomes unacceptable to users, while the efficient and effective delivery of content on the Web plays an important role in improving performance. Content distribution networks were introduced to address this issue. Replicatin...
Combining replica placement and caching techniques in content distribution networks
Caching and replication have emerged as the two primary techniques for reducing the delay experienced by end-users when downloading web pages. Even though these techniques may benefit from each other, previous research work tends to focus on either one of them separately. In particular, caching has been studied mostly in the context of proxy server systems, while replication is the technology b...
A Web-Based Evolutionary Model for Internet Data Caching
Caching is a standard solution to the problem of insufficient bandwidth caused by the rapid increase of information circulation across the Internet. Cache consistency mechanisms are a crucial component of each cache scheme, influencing the cache usefulness and reliability. This paper presents a model for optimizing Internet cache content by the use of a genetic algorithm and examines the model b...
A New Environment for Web Caching and Replication Study
Web caching and replication have received considerable attention in the past years due to their effectiveness in reducing client response time and network traffic. In this paper we describe a tool for improving the study of Web caching and replication. An important difference from other tools is that this tool tries to take advantage of two tools very useful in caching and computer-network study, Proxycizer[Ga...
Web Caching and Replication
As the Internet has become an essential part of everyday life, hundreds of millions of users now connect to it. At the same time, more resource-hungry and performance-sensitive applications have emerged. Expectations of scalability and performance have made caching and replication common features of the infrastructure of the Web. By directing the workload away from possib...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
Publication date: 2005